159 research outputs found

    A critical evaluation of network and pathway based classifiers for outcome prediction in breast cancer

    Recently, several classifiers that combine primary tumor data, like gene expression data, and secondary data sources, such as protein-protein interaction networks, have been proposed for predicting outcome in breast cancer. In these approaches, new composite features are typically constructed by aggregating the expression levels of several genes. The secondary data sources are employed to guide this aggregation. Although many studies claim that these approaches improve classification performance over single gene classifiers, the gain in performance is difficult to assess. This stems mainly from the fact that different breast cancer data sets and validation procedures are employed to assess the performance. Here we address these issues by employing a large cohort of six breast cancer data sets as benchmark set and by performing an unbiased evaluation of the classification accuracies of the different approaches. Contrary to previous claims, we find that composite feature classifiers do not outperform simple single gene classifiers. We investigate the effect of (1) the number of selected features; (2) the specific gene set from which features are selected; (3) the size of the training set and (4) the heterogeneity of the data set on the performance of composite feature and single gene classifiers. Strikingly, we find that randomization of secondary data sources, which destroys all biological information in these sources, does not result in a deterioration in performance of composite feature classifiers. Finally, we show that when a proper correction for gene set size is performed, the stability of single gene sets is similar to the stability of composite feature sets. Based on these results there is currently no reason to prefer prognostic classifiers based on composite features over single gene classifiers for predicting outcome in breast cancer.

    MetaPath: identifying differentially abundant metabolic pathways in metagenomic datasets

    Enabled by rapid advances in sequencing technology, metagenomic studies aim to characterize entire communities of microbes, bypassing the need for culturing individual bacterial members. One major goal of metagenomic studies is to identify specific functional adaptations of microbial communities to their habitats. The functional profile and the abundances for a sample can be estimated by mapping metagenomic sequences to the global metabolic network consisting of thousands of molecular reactions. Here we describe a powerful analytical method (MetaPath) that can identify differentially abundant pathways in metagenomic datasets, relying on a combination of metagenomic sequence data and prior metabolic pathway knowledge. First, we introduce a scoring function for an arbitrary subnetwork and find the max-weight subnetwork in the global network by a greedy search algorithm. Then we compute two p-values (p_abund and p_struct) using nonparametric approaches to answer two different statistical questions: (1) Is this subnetwork differentially abundant? (2) What is the probability of finding such good subnetworks by chance given the data and network structure? Finally, significant metabolic subnetworks are discovered based on these two p-values. In order to validate our methods, we have designed a simulated metabolic pathways dataset and show that MetaPath outperforms other commonly used approaches. We also demonstrate the power of our methods in analyzing two publicly available metagenomic datasets, and show that the subnetworks identified by MetaPath provide valuable insights into the biological activities of the microbiome. We have introduced a statistical method for finding significant metabolic subnetworks from metagenomic datasets. Compared with previous methods, results from MetaPath are more robust against noise in the data, and have significantly higher sensitivity and specificity (when tested on simulated datasets).
When applied to two publicly available metagenomic datasets, the output of MetaPath is consistent with previous observations and also provides several new insights into the metabolic activity of the gut microbiome. The software is freely available at http://metapath.cbcb.umd.edu. https://doi.org/10.1186/1753-6561-5-S2-S
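
The greedy max-weight subnetwork search mentioned in the abstract above can be illustrated with a minimal Python sketch. This is not the MetaPath implementation: the scoring function (sum of node weights divided by the square root of the subnetwork size), the adjacency-dictionary input, and the names `subnetwork_score`, `greedy_subnetwork`, and `best_subnetwork` are all assumptions made for illustration, since the abstract does not specify them.

```python
import math

def subnetwork_score(nodes, weight):
    # Assumed scoring function (not from the abstract): summed node
    # weights, scaled by sqrt(size) so large subnetworks are not favoured
    # merely for being large.
    return sum(weight[n] for n in nodes) / math.sqrt(len(nodes))

def greedy_subnetwork(adj, weight, seed):
    # Grow a connected subnetwork from `seed`: at each step add the
    # neighbouring node that yields the highest score, and stop as soon
    # as no neighbour improves the current score.
    current = {seed}
    best = subnetwork_score(current, weight)
    while True:
        frontier = {m for n in current for m in adj[n]} - current
        candidates = [(subnetwork_score(current | {m}, weight), m) for m in frontier]
        if not candidates:
            break
        score, node = max(candidates)
        if score <= best:
            break
        current.add(node)
        best = score
    return current, best

def best_subnetwork(adj, weight):
    # Run the greedy search from every node and keep the top-scoring result.
    return max((greedy_subnetwork(adj, weight, s) for s in adj), key=lambda r: r[1])
```

A permutation-style significance check in the spirit of the two p-values would rerun `best_subnetwork` on shuffled weights or shuffled sample labels and compare the observed score with the resulting null distribution; that step is omitted here because the abstract gives no details of the null models.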

    Identification of differentially expressed subnetworks based on multivariate ANOVA

    Background: Since high-throughput protein-protein interaction (PPI) data has recently become available for humans, there has been a growing interest in combining PPI data with other genome-wide data. In particular, the identification of phenotype-related PPI subnetworks using gene expression data has been of great interest. Successful integration for the identification of significant subnetworks requires the use of a search algorithm with a proper scoring method. Here we propose a multivariate analysis of variance (MANOVA)-based scoring method with a greedy search for identifying differentially expressed PPI subnetworks. Results: Given the MANOVA-based scoring method, we performed a greedy search to identify the subnetworks with the maximum scores in the PPI network. Our approach was successfully applied to human microarray datasets. Each identified subnetwork was annotated with the Gene Ontology (GO) term, resulting in the phenotype-related functional pathway or complex. We also compared these results with those of other scoring methods such as t statistic- and mutual information-based scoring methods. The MANOVA-based method produced subnetworks with a larger number of proteins than the other methods. Furthermore, the subnetworks identified by the MANOVA-based method tended to consist of highly correlated proteins. Conclusion: This article proposes a MANOVA-based scoring method to combine PPI data with expression data using a greedy search. This method is recommended for the highly sensitive detection of large subnetworks.

    Network-based functional enrichment

    Background: Many methods have been developed to infer and reason about molecular interaction networks. These approaches often yield networks with hundreds or thousands of nodes and up to an order of magnitude more edges. It is often desirable to summarize the biological information in such networks. A very common approach is to use gene function enrichment analysis for this task. A major drawback of this method is that it ignores information about the edges in the network being analyzed, i.e., it treats the network simply as a set of genes. In this paper, we introduce a novel method for functional enrichment that explicitly takes network interactions into account. Results: Our approach naturally generalizes Fisher’s exact test, a gene set-based technique. Given a function of interest, we compute the subgraph of the network induced by genes annotated to this function. We use the sequence of sizes of the connected components of this sub-network to estimate its connectivity. We estimate the statistical significance of the connectivity empirically by a permutation test. We present three applications of our method: i) determine which functions are enriched in a given network, ii) given a network and an interesting sub-network of genes within that network, determine which functions are enriched in the sub-network, and iii) given two networks, determine the functions for which the connectivity improves when we merge the second network into the first. Through these applications, we show that our approach is a natural alternative to network clustering algorithms. Conclusions: We presented a novel approach to functional enrichment that takes into account the pairwise relationships among genes annotated by a particular function. Each of the three applications discovers highly relevant functions. We used our methods to study biological data from three different organisms.
Our results demonstrate the wide applicability of our methods. Our algorithms are implemented in C++ and are freely available under the GNU General Public License at our supplementary website. Additionally, all our input data and results are available at http://bioinformatics.cs.vt.edu/~murali/supplements/2011-incob-nbe/.
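
The connectivity idea in the abstract above (induced subgraph, connected-component sizes, permutation test) can be sketched in Python as follows. This is a simplified illustration, not the authors' C++ implementation: it assumes the network is an adjacency dictionary, uses only the size of the largest component as the connectivity statistic (the paper uses the whole sequence of component sizes), and draws random gene sets of equal size as the null model. The names `component_sizes` and `connectivity_p_value` are hypothetical.

```python
import random

def component_sizes(adj, genes):
    # Sizes of the connected components of the subgraph induced by
    # `genes`, sorted with the largest component first.
    genes = set(genes)
    seen, sizes = set(), []
    for start in genes:
        if start in seen:
            continue
        stack, size = [start], 0
        seen.add(start)
        while stack:
            node = stack.pop()
            size += 1
            for nb in adj.get(node, ()):
                if nb in genes and nb not in seen:
                    seen.add(nb)
                    stack.append(nb)
        sizes.append(size)
    return sorted(sizes, reverse=True)

def connectivity_p_value(adj, annotated, n_perm=1000, seed=0):
    # Empirical p-value: how often does a random gene set of the same
    # size have a largest component at least as big as the observed one?
    # `annotated` must be a non-empty subset of the network's nodes.
    rng = random.Random(seed)
    universe = sorted(adj)
    observed = component_sizes(adj, annotated)[0]
    hits = sum(
        component_sizes(adj, rng.sample(universe, len(annotated)))[0] >= observed
        for _ in range(n_perm)
    )
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_perm + 1)
```

A small p-value would suggest that genes annotated to the function are more tightly connected in the network than a random gene set of the same size, under the simplifying assumptions stated above.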

    Systems biology of platelet-vessel wall interactions

    Platelets are small, anucleated cells that participate in primary hemostasis by forming a hemostatic plug at the site of a blood vessel's breach, preventing blood loss. However, hemostatic events can lead to excessive thrombosis, resulting in life-threatening strokes, emboli, or infarction. Development of multi-scale models coupling processes at several scales and running predictive model simulations on powerful computer clusters can help interdisciplinary groups of researchers to suggest and test new patient-specific treatment strategies.

    Current understanding of the relationship between cervical manipulation and stroke: what does it mean for the chiropractic profession?

    The understanding of the relationship between cervical manipulative therapy (CMT) and vertebral artery dissection and stroke (VADS) has evolved considerably over the years. In the beginning the relationship was seen as simple cause-effect, in which CMT was seen to cause VADS in certain susceptible individuals. This was perceived as extremely rare by chiropractic physicians, but as far more common by neurologists and others. Recent evidence has clarified the relationship considerably, and suggests that the relationship is not causal, but that patients with VADS often have initial symptoms which cause them to seek care from a chiropractic physician and have a stroke some time after, independent of the chiropractic visit.

    Phase I/II study of oral etoposide plus GM-CSF as second-line chemotherapy in platinum-pretreated patients with advanced ovarian cancer

    The aim of this phase I/II study was to determine the maximum tolerated dose (MTD) and the dose-limiting toxicities of chronic oral etoposide given on days 1–10 followed by rescue with subcutaneous (s.c.) granulocyte-macrophage colony-stimulating factor (GM-CSF) on days 12–19 as second-line chemotherapy in platinum-pretreated patients (pts) with advanced ovarian carcinoma. Cohorts of three to six pts were treated with doses of oral etoposide from 750 mg m⁻² cycle⁻¹ escalated to 1250 mg m⁻² cycle⁻¹ over 10 days, every 3 weeks. Subcutaneous GM-CSF, 400 μg once daily, days 12–19, was added if dose-limiting granulocytopenia was encountered. In total, 18 pts with a median Karnofsky index of 80% (range, 70–100%) and a median time elapsed since the last platinum dose of 10 months (range, 1–24 months), 30% of whom showed visceral metastases, were treated at four dose levels (DLs) of oral etoposide on days 1–10 of each cycle as follows: DL 1, 750 mg m⁻² cycle⁻¹, without GM-CSF, three pts; DL 2, 1000 mg m⁻² cycle⁻¹, without GM-CSF, three pts; DL 3, 1000 mg m⁻² cycle⁻¹, with GM-CSF, six pts; and DL 4, 1250 mg m⁻² cycle⁻¹, with GM-CSF, six pts. All pts were assessable for toxicity and 16 pts for response. Dose-limiting toxicity (DLT) was reached at DL 4 by three of six pts, showing World Health Organization (WHO) toxicity grade 4. One patient died from gram-negative sepsis associated with granulocytopenia grade 4. Two more pts developed uncomplicated granulocytopenia grade 4. Thus, we recommend that DL 3 can be used for further phase II evaluation (i.e. oral etoposide 1000 mg m⁻² cycle⁻¹, days 1–10, followed by s.c. GM-CSF 400 μg, days 12–19). The clinical complete or partial responses in each patient cohort were: DL 1, one of three pts; DL 2, one of three pts; DL 3, three of five pts; and DL 4, two of five pts. In conclusion, in this phase I/II study, we defined the MTD and the dose recommended for the therapy with oral etoposide given over 10 days followed by s.c. GM-CSF in platinum-pretreated patients with advanced ovarian cancer. Our data demonstrate encouraging activity of this regimen and strongly support its further investigation in a phase II study.

    Interpreting Metabolomic Profiles using Unbiased Pathway Models

    Human disease is heterogeneous, with similar disease phenotypes resulting from distinct combinations of genetic and environmental factors. Small-molecule profiling can address disease heterogeneity by evaluating the underlying biologic state of individuals through non-invasive interrogation of plasma metabolite levels. We analyzed metabolite profiles from an oral glucose tolerance test (OGTT) in 50 individuals, 25 with normal (NGT) and 25 with impaired glucose tolerance (IGT). Our focus was to elucidate underlying biologic processes. Although we initially found little overlap between changed metabolites and preconceived definitions of metabolic pathways, the use of unbiased network approaches identified significant concerted changes. Specifically, we derived a metabolic network with edges drawn between reactant and product nodes in individual reactions and between all substrates of individual enzymes and transporters. We searched for “active modules”—regions of the metabolic network enriched for changes in metabolite levels. Active modules identified relationships among changed metabolites and highlighted the importance of specific solute carriers in metabolite profiles. Furthermore, hierarchical clustering and principal component analysis demonstrated that changed metabolites in OGTT naturally grouped according to the activities of the System A and L amino acid transporters, the osmolyte carrier SLC6A12, and the mitochondrial aspartate-glutamate transporter SLC25A13. Comparison between NGT and IGT groups supported blunted glucose- and/or insulin-stimulated activities in the IGT group. Using unbiased pathway models, we offer evidence supporting the important role of solute carriers in the physiologic response to glucose challenge and conclude that carrier activities are reflected in individual metabolite profiles of perturbation experiments. 
Given the involvement of transporters in human disease, metabolite profiling may contribute to improved disease classification via the interrogation of specific transporter activities.

    Proteome changes in platelets activated by arachidonic acid, collagen, and thrombin

    Background: Platelets are small anucleated blood particles that play a key role in the control of bleeding. Platelets need to be activated to perform their functions and participate in hemostasis. The process of activation is accompanied by vast protein reorganization and posttranslational modifications. The goal of this study was to identify changes in proteins in platelets activated by different agonists. Platelets were activated by three different agonists: arachidonic acid, collagen, and thrombin. 2D SDS-PAGE (pI 4-7) was used to separate platelet proteins. Proteomes of activated and resting platelets were compared with each other by Progenesis SameSpots statistical software, and proteins were identified by nanoLC-MS/MS. Results: In total, 190 spots were found to be significantly different. Of these, 180 spots were successfully identified and correspond to 144 different proteins. Five proteins were found that had not previously been identified in platelets: protein CDV3 homolog, protein ETHE1, protein LZIC, FGFR1 oncogene partner 2, and guanine nucleotide-binding protein subunit beta-5. Using spot expression profile analysis, we found two proteins (WD repeat-containing protein 1 and mitochondrial glycerol-3-phosphate dehydrogenase) that may be part of thrombin specific activation or signal transduction pathway(s). Conclusions: Our results, characterizing the differences within proteins in both activated (by various agonists) and resting platelets, can thus contribute to the basic knowledge of platelets and to the understanding of the function and development of new antiplatelet drugs.